feat: add Gemini embedding provider for codebase indexing #5228

SannidhyaSah · 2025-06-29T06:45:18Z

Description

This PR adds Google Gemini as a new embedding provider for codebase indexing, expanding the available options beyond OpenAI and Ollama. The implementation uses Gemini's text-embedding-004 model through Google's OpenAI-compatible API endpoint.

Technical Architecture Decision

Why OpenAI-Compatible API Instead of Native Gemini Client?

We implemented GeminiEmbedder as a wrapper around OpenAICompatibleEmbedder rather than using the native Gemini/GenAI client due to significant performance limitations:

~10x Performance Improvement: The native Gemini/GenAI client has inherent limitations that make embedding generation very slow
Better Rate Limiting: OpenAI-compatible endpoint handles batching and rate limiting more efficiently
Improved User Experience: Prevents frustrating delays during codebase indexing
Proven Reliability: Leverages the battle-tested OpenAI-compatible infrastructure

This architectural choice prioritizes user experience while maintaining full compatibility with Gemini's powerful text-embedding-004 model.
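To make the decision above concrete, here is a minimal sketch of what a batched embedding request to the OpenAI-compatible endpoint might look like. The base URL and model name come from this PR; the `embeddings` path and the Bearer-token header follow the OpenAI embeddings API convention. `buildGeminiEmbeddingRequest` and `EmbeddingHttpRequest` are hypothetical names used for illustration and are not code from this PR.

```typescript
// Values taken from the PR description.
const SKETCH_GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"
const SKETCH_GEMINI_MODEL = "text-embedding-004"

// Hypothetical shape describing an HTTP request; not part of the PR.
interface EmbeddingHttpRequest {
	url: string
	method: "POST"
	headers: Record<string, string>
	body: string
}

// Builds (but does not send) one request that embeds several texts at once.
function buildGeminiEmbeddingRequest(apiKey: string, texts: string[]): EmbeddingHttpRequest {
	if (!apiKey) {
		throw new Error("A Gemini API key is required")
	}
	return {
		// The base URL already ends with a slash, so the path is appended directly.
		url: `${SKETCH_GEMINI_BASE_URL}embeddings`,
		method: "POST",
		headers: {
			"Content-Type": "application/json",
			Authorization: `Bearer ${apiKey}`,
		},
		// OpenAI-style payload: one model id plus an array of input strings.
		body: JSON.stringify({ model: SKETCH_GEMINI_MODEL, input: texts }),
	}
}
```

Sending several inputs in one payload is what lets an OpenAI-compatible client batch work, which is the behaviour the list above relies on.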

Changes Made

Backend Implementation

- New GeminiEmbedder class (src/services/code-index/embedders/gemini.ts) - Optimized wrapper around OpenAICompatibleEmbedder with Gemini-specific configuration:
  - Base URL: https://generativelanguage.googleapis.com/v1beta/openai/
  - Model: text-embedding-004
  - Dimension: 768
- Updated service factory (src/services/code-index/service-factory.ts) - Added Gemini provider support
- Updated config manager (src/services/code-index/config-manager.ts) - Added Gemini configuration handling
- Updated type definitions (packages/types/src/global-settings.ts, packages/types/src/codebase-index.ts) - Added Gemini provider types
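The wrapper pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in gemini.ts: `OpenAICompatibleEmbedderStub` is a hypothetical stand-in for the project's OpenAICompatibleEmbedder (whose real constructor and methods may differ), and `GeminiEmbedderSketch` only shows how the base URL, model, and dimension from this PR can be pinned so that the caller supplies nothing but an API key.

```typescript
// Hypothetical stand-in for the project's OpenAICompatibleEmbedder.
class OpenAICompatibleEmbedderStub {
	readonly baseUrl: string
	readonly apiKey: string
	readonly modelId: string

	constructor(baseUrl: string, apiKey: string, modelId: string) {
		this.baseUrl = baseUrl
		this.apiKey = apiKey
		this.modelId = modelId
	}
}

// Thin wrapper that fixes the Gemini-specific settings listed in the PR.
class GeminiEmbedderSketch {
	static readonly BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"
	static readonly MODEL = "text-embedding-004"
	static readonly DIMENSION = 768

	private readonly inner: OpenAICompatibleEmbedderStub

	constructor(apiKey: string) {
		if (!apiKey) {
			throw new Error("API key is required for the Gemini embedder")
		}
		// Delegate to the generic embedder with the pre-configured values.
		this.inner = new OpenAICompatibleEmbedderStub(
			GeminiEmbedderSketch.BASE_URL,
			apiKey,
			GeminiEmbedderSketch.MODEL,
		)
	}

	// Exposes the pinned settings; callers never pass them in.
	get settings(): { baseUrl: string; modelId: string; dimension: number } {
		return {
			baseUrl: this.inner.baseUrl,
			modelId: this.inner.modelId,
			dimension: GeminiEmbedderSketch.DIMENSION,
		}
	}
}
```

Keeping the Gemini class this thin means request handling stays in one place, the generic embedder, and the provider-specific class carries only constants.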

Frontend Implementation

Updated CodeIndexSettings component (webview-ui/src/components/settings/CodeIndexSettings.tsx) - Added Gemini to provider dropdown
Simplified configuration - Only requires API key input (base URL and model are pre-configured for optimal performance)

Internationalization

Complete i18n support - Added "Gemini" translations across all 18 supported languages:
- Catalan, German, English, Spanish, French, Hindi, Indonesian, Italian, Japanese, Korean, Dutch, Polish, Portuguese (Brazil), Russian, Turkish, Vietnamese, Chinese (Simplified), Chinese (Traditional)

Testing

Comprehensive unit tests (src/services/code-index/embedders/__tests__/gemini.spec.ts) - Tests for GeminiEmbedder functionality
Updated existing tests - Modified service factory and config manager tests to include Gemini provider

Testing

All existing tests pass
Added comprehensive unit tests for GeminiEmbedder
Updated service factory and config manager tests
Manual testing completed:
- Verified Gemini appears in provider dropdown
- Confirmed API key input field works correctly
- Tested provider switching functionality
- Validated performance improvements over native client approach

Verification of Acceptance Criteria

Based on issue #4524 acceptance criteria:

Users can access Experimental section - Codebase indexing is in experimental settings
Users can enable Codebase indexing - Feature toggle works as expected
Users can select Gemini as Embedding Provider - Gemini now appears in dropdown
Simple configuration - Only API key required, model and endpoint are pre-configured for optimal performance

Related Issues

Closes Codebase Indexing: Add Google Gemini as provider #3967 - Primary request for Gemini embedding support
Closes Enhancement of Gemini Integration in Roo Code #4524 - Duplicate issue with detailed acceptance criteria
Related to (part of broader Gemini integration efforts):
- Feat codebase indexing add gemini as provider #3971
- Finer-grained control of Gemini models and Enhancement of Gemini Integration in Roo Code #4483
- Getting code indexing error when using litellm with vertexai gemini-embedding-001 model #5123
- Codebase Indexing with Azure OpenAI: API version missing #5212
- feat: Allow custom embedding model configuration for Codebase Indexing #3974

User Experience

Users can now:

Navigate to Settings → Experimental → Codebase Indexing
Select "Gemini" from the Embedding Provider dropdown
Enter their Gemini API key
Start indexing their codebase with Gemini's text-embedding-004 model

The implementation provides a seamless experience with automatic configuration of Gemini's optimal settings and significantly improved performance compared to the native client approach.

Performance Benefits

Fast Embedding Generation: ~10x faster than native Gemini client
Efficient Batching: Optimized request handling through OpenAI-compatible endpoint
Reduced Latency: Better rate limiting and connection management
Improved Reliability: Proven infrastructure for production workloads

Checklist

Code follows project style guidelines
Self-review completed
Comments added for complex logic
No breaking changes
All translations added for i18n support
Comprehensive test coverage
Follows existing embedding provider patterns
Performance optimizations implemented

Important

Adds Gemini embedding provider for codebase indexing using OpenAI-compatible API, with backend, frontend, i18n, and testing updates.

Behavior:
- Adds GeminiEmbedder class in gemini.ts using OpenAI-compatible API for codebase indexing.
- Supports text-embedding-004 model with fixed dimension 768.
- Updates service-factory.ts and config-manager.ts to include Gemini provider.
Frontend:
- Updates CodeIndexSettings.tsx to add Gemini to provider dropdown.
- Simplifies configuration to require only API key input.
Internationalization:
- Adds translations for "Gemini" in 18 languages.
Testing:
- Adds unit tests in gemini.spec.ts.
- Updates tests in config-manager.spec.ts and service-factory.spec.ts for Gemini support.
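As a rough illustration of the factory and config-manager changes summarized above, the sketch below branches on the selected provider and requires only an API key for Gemini. All names here (`EmbedderProviderSketch`, `CodeIndexConfigSketch`, `describeEmbedder`, `geminiApiKey`) and the provider string literals are assumptions made for the example; the real definitions live in service-factory.ts, config-manager.ts, and the shared type packages.

```typescript
// Provider identifiers; "gemini" is the one this PR adds. The exact string
// literals used by the project are an assumption.
type EmbedderProviderSketch = "openai" | "ollama" | "openai-compatible" | "gemini"

// Hypothetical slice of the code-index configuration.
interface CodeIndexConfigSketch {
	embedderProvider: EmbedderProviderSketch
	geminiApiKey?: string
}

// Hypothetical summary of what the factory would construct.
interface EmbedderDescriptorSketch {
	provider: EmbedderProviderSketch
	modelId?: string
	dimension?: number
}

// Gemini counts as configured as soon as a non-empty API key is present.
function isGeminiConfigured(config: CodeIndexConfigSketch): boolean {
	return config.embedderProvider === "gemini" && Boolean(config.geminiApiKey)
}

function describeEmbedder(config: CodeIndexConfigSketch): EmbedderDescriptorSketch {
	switch (config.embedderProvider) {
		case "gemini":
			if (!isGeminiConfigured(config)) {
				throw new Error("Gemini API key is missing")
			}
			// Model and dimension are fixed, as stated in the PR.
			return { provider: "gemini", modelId: "text-embedding-004", dimension: 768 }
		default:
			// Other providers keep their existing, user-supplied settings.
			return { provider: config.embedderProvider }
	}
}
```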

This description was auto-generated for dfc844e.

- Changed VSCodeTextField to use placeholder instead of default value
- Users can now clear the field without it auto-filling
- Maintains backward compatibility with existing configurations
- All existing tests continue to pass
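One possible reading of the placeholder change above, sketched as a small helper: the field may be empty while the user is editing, and a default is applied only when the value is committed (for example on blur). `resolveQdrantUrlOnBlur` is a hypothetical name, and `http://localhost:6333` is an assumed default based on Qdrant's usual local port; the commit text does not state the actual value.

```typescript
// Assumed default; the commit message does not state the real value.
const ASSUMED_DEFAULT_QDRANT_URL = "http://localhost:6333"

// Returns the value to persist when the input loses focus: an empty or
// whitespace-only input falls back to the default, anything else is kept.
function resolveQdrantUrlOnBlur(rawInput: string, fallback: string = ASSUMED_DEFAULT_QDRANT_URL): string {
	const trimmed = rawInput.trim()
	return trimmed === "" ? fallback : trimmed
}
```

Because the fallback is applied on commit rather than on every keystroke, clearing the field does not immediately refill it.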

shariqriazz · 2025-06-29T06:55:08Z

You might wanna adjust MAX_TOKENS_PER_TEXT: 2048 for this model

…splay
- Add GEMINI_MAX_ITEM_TOKENS constant (2048) for text-embedding-004 model
- Update OpenAICompatibleEmbedder to accept configurable maxItemTokens
- Pass token limit from GeminiEmbedder to OpenAICompatibleEmbedder
- Fix UI to properly display Gemini model in dropdown
- Update tests to handle new maxItemTokens parameter
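A minimal sketch of how a configurable per-item token limit might be applied before texts are sent for embedding. The 2048 limit is the value named in the commit above; `estimateTokens` (a rough four-characters-per-token heuristic) and `partitionByTokenLimit` are hypothetical helpers, and the project's actual counting and skipping logic may differ.

```typescript
// Limit named in the commit message for text-embedding-004.
const SKETCH_GEMINI_MAX_ITEM_TOKENS = 2048

// Rough heuristic (about 4 characters per token); an assumption for
// illustration, not the tokenizer used by the model.
function estimateTokens(text: string): number {
	return Math.ceil(text.length / 4)
}

// Splits texts into those within the limit and those that exceed it.
function partitionByTokenLimit(
	texts: string[],
	maxItemTokens: number = SKETCH_GEMINI_MAX_ITEM_TOKENS,
): { accepted: string[]; skipped: string[] } {
	const accepted: string[] = []
	const skipped: string[] = []
	for (const text of texts) {
		if (estimateTokens(text) > maxItemTokens) {
			skipped.push(text)
		} else {
			accepted.push(text)
		}
	}
	return { accepted, skipped }
}
```

Passing the limit as a parameter with a default mirrors the idea in the commit: the generic embedder accepts any limit, and the Gemini wrapper supplies its own.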

daniel-lxs

Thank you @SannidhyaSah

LGTM

SannidhyaSah and others added 4 commits June 28, 2025 18:51

fix: set default Qdrant URL and handle empty input on blur

3f06da5

Merge branch 'RooCodeInc:main' into fix/qdrant-url-placeholder

64442ec

feat: add Gemini embedding provider for codebase indexing

dfc844e

SannidhyaSah requested review from cte, jr and mrubens as code owners June 29, 2025 06:45

github-project-automation bot moved this to New in Roo Code Roadmap Jun 29, 2025

github-project-automation bot moved this to Triage in Roo Code Roadmap Jun 29, 2025

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Jun 29, 2025

dosubot bot added the size:L and enhancement labels Jun 29, 2025

hannesrudolph added the Issue/PR - Triage label Jun 29, 2025

This was referenced Jun 29, 2025

Codebase Indexing: Add Google Gemini as provider #3967

Closed

Enhancement of Gemini Integration in Roo Code #4524

Closed

SannidhyaSah mentioned this pull request Jun 29, 2025

Feat codebase indexing add gemini as provider #3971

Closed

24 tasks

SannidhyaSah added 2 commits June 29, 2025 12:44

fix: improve Gemini model configuration consistency

0302e16

- Fix inconsistent logic in CodeIndexSettings where Gemini was incorrectly included in OpenAI model fallback
- Add clarifying comment for hardcoded Gemini embedding dimension (768) in service-factory

daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jun 29, 2025

hannesrudolph added PR - Needs Preliminary Review and removed Issue/PR - Triage labels Jun 29, 2025

daniel-lxs approved these changes Jul 1, 2025

dosubot bot added the lgtm label Jul 1, 2025

daniel-lxs moved this from PR [Needs Prelim Review] to PR [Needs Review] in Roo Code Roadmap Jul 1, 2025

hannesrudolph added PR - Needs Review and removed PR - Needs Preliminary Review labels Jul 1, 2025

mrubens approved these changes Jul 3, 2025

mrubens merged commit 87aa688 into RooCodeInc:main Jul 3, 2025
20 checks passed

github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Jul 3, 2025

github-project-automation bot moved this from New to Done in Roo Code Roadmap Jul 3, 2025

SannidhyaSah deleted the geminiembedding-provider branch July 12, 2025 05:16
